N=1:

rank | frequency | n-gram |
---|---|---|
1 | 2303 | -i |
2 | 1855 | -a |
3 | 1296 | -u |
4 | 1155 | -ã |
5 | 765 | -e |

N=2:

rank | frequency | n-gram |
---|---|---|
1 | 410 | -lu |
2 | 400 | -ea |
3 | 318 | -ia |
4 | 308 | -li |
5 | 298 | -ri |

N=3:

rank | frequency | n-gram |
---|---|---|
1 | 167 | -lji |
2 | 138 | -ari |
3 | 133 | -ili |
4 | 125 | -lui |
5 | 106 | -scã |

N=4:

rank | frequency | n-gram |
---|---|---|
1 | 94 | -ljei |
2 | 84 | -ascã |
3 | 70 | -escu |
4 | 67 | -shti |
5 | 65 | -area |

N=5:

rank | frequency | n-gram |
---|---|---|
1 | 74 | -eascã |
2 | 61 | -ashti |
3 | 44 | -iljei |
4 | 36 | -ãljei |
5 | 32 | -njlji |

The tables show the most frequent letter N-grams at the ends of words for N=1…5. The procedure parallels 2.2.5 Most frequent word beginnings; the aim here is suffix detection rather than prefix detection.
For N=3:
SELECT @pos:=(@pos+1), xx.* FROM (SELECT @pos:=0) r, (SELECT COUNT(*) AS cnt, CONCAT("-", RIGHT(word,3)) FROM words WHERE w_id>100 GROUP BY RIGHT(word,3) ORDER BY cnt DESC) xx LIMIT 5;
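
The same count can be sketched outside the database for any N. The following is a minimal Python illustration, not the code used to produce the tables. It assumes the word list holds one entry per word type, as the rows of `words` appear to. The function name `top_endings` and the sample words are hypothetical, and the query's `w_id>100` filter is not reproduced. As with `RIGHT(word,3)`, a word shorter than N contributes its whole form.

```python
from collections import Counter


def top_endings(words, n, limit=5):
    """Return (rank, frequency, "-ngram") rows for the most frequent word endings.

    `words` is assumed to be an iterable of distinct word types.
    """
    # w[-n:] yields the last n letters, or the whole word if it is shorter.
    counts = Counter(w[-n:] for w in words)
    # Sort by descending frequency; break ties alphabetically so the
    # result is deterministic (the SQL query leaves tie order undefined).
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [
        (rank, freq, "-" + ending)
        for rank, (ending, freq) in enumerate(ordered[:limit], start=1)
    ]


if __name__ == "__main__":
    # Made-up sample list, for illustration only.
    sample = ["casa", "masa", "rasa", "lui", "cui", "a"]
    for row in top_endings(sample, n=2, limit=2):
        print(row)
    # prints (1, 3, '-sa') then (2, 2, '-ui')
```

Sorting on the pair (negative frequency, ending) keeps the ranking stable between runs, which the `ORDER BY cnt DESC` clause alone does not guarantee when two endings have the same count.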
2.2.5 Most frequent word beginnings